A scoring function development for protein structure classification based on sequence and structure information
Abstract
This project combines sequence and structure information about proteins to build a scoring function for classifying protein structures whose class membership is unknown. The scoring function is essentially a distance between two proteins. We began by searching for useful pieces of information and constructing geometric and topological representations of them, together with distance metrics. In this paper, we use sequence information, residue number information, and amino acid information. Then, by learning from 40 classified proteins and applying linear programming, we try to determine a weight for each piece of information. Ideally, the "good" pieces of information receive higher weights and the less good ones receive lower weights. The scoring function with the determined weights can then be used to predict the classification of other proteins whose structures are unknown.
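As a rough illustration of the idea of a scoring function as a weighted distance between two proteins, the sketch below combines three per-feature distances into one score and assigns a query to the class of its nearest labeled protein. Everything concrete here is an assumption made for illustration: the three distance definitions (ungapped mismatch fraction, relative length difference, amino acid composition difference), the function names, the example sequences, and the weight values are hypothetical and are not taken from the paper. In the paper the weights are determined by linear programming over 40 classified proteins, a step this sketch does not reproduce.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def mismatch_distance(seq_a, seq_b):
    """Fraction of positions that differ, ungapped, over the longer length."""
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 1.0 - matches / max(len(seq_a), len(seq_b))


def length_distance(seq_a, seq_b):
    """Relative difference in the number of residues (0 to 1)."""
    return abs(len(seq_a) - len(seq_b)) / max(len(seq_a), len(seq_b))


def composition_distance(seq_a, seq_b):
    """L1 distance between amino acid frequency vectors (0 to 2)."""
    count_a, count_b = Counter(seq_a), Counter(seq_b)
    return sum(
        abs(count_a[aa] / len(seq_a) - count_b[aa] / len(seq_b))
        for aa in AMINO_ACIDS
    )


# Hypothetical feature set; the paper's actual metrics are not specified here.
FEATURE_DISTANCES = (mismatch_distance, length_distance, composition_distance)


def score(seq_a, seq_b, weights):
    """Weighted sum of per-feature distances between two non-empty sequences."""
    return sum(w * d(seq_a, seq_b) for w, d in zip(weights, FEATURE_DISTANCES))


def classify(query, labeled, weights):
    """Return the class label of the labeled protein closest to the query."""
    _, best_label = min(labeled, key=lambda item: score(query, item[0], weights))
    return best_label


if __name__ == "__main__":
    weights = (0.5, 0.2, 0.3)  # placeholder values, not learned weights
    labeled = [("ACDEFGHIK", "alpha"), ("WWYYVVTTSS", "beta")]
    print(classify("ACDEFGHIR", labeled, weights))
```

With weights learned from classified examples, a more informative feature would contribute more to the score, which is the behavior the abstract describes for "good" pieces of information.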
Similar resources
Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
DNA sequence, containing all genetic traits is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...
In silico Analysis and Molecular Modeling of RNA Polymerase, Sigma S (RpoS) Protein in Pseudomonas aeruginosa PAO1
Background: Sigma factors are proteins that regulate transcription in bacteria. Sigma factors can be activated in response to different environmental conditions. The rpoS (RNA polymerase, sigma S) gene encodes sigma-38 (σ38, or RpoS), a 37.8 kDa protein in Pseudomonas aeruginosa (P. aeruginosa) strains. RpoS is a central regulator of the general stress response and operates in both retroa...
Application of a simple likelihood ratio approximant to protein sequence classification
MOTIVATION Likelihood ratio approximants (LRA) have been widely used for model comparison in statistics. The present study was undertaken in order to explore their utility as a scoring (ranking) function in the classification of protein sequences. RESULTS We used a simple LRA-based on the maximal similarity (or minimal distance) scores of the two top ranking sequence classes. The scoring meth...
In Silico Analysis of Primary Sequence and Tertiary Structure of Lepidium Draba Peroxidase
Peroxidase enzymes are widely used in industry and diagnosis. Recently, we introduced a new kind of peroxidase gene from Lepidium draba (LDP). According to protein multiple sequence alignment results, LDP had 93% similarity and 88.96% identity with horseradish peroxidase C1A (HRP C1A). In the current study we employed in silico tools to determine, to which group of peroxidase enzymes LDP...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Publication date: 2008